Description of the Invention "A METHOD FOR
THE HARDWARE IMPLEMENTATION OF THE IDEA CRYPTOGRAPHIC
ALGORITHM - HiPCrypto"
TECHNICAL FIELD 

HiPCrypto is a proposed hardware architecture for the IDEA
cryptographic algorithm. It exploits spatial and temporal
parallelism in order to reach the processing speeds required
by real-time applications and high-speed data communication
networks such as ATM.

Nowadays there is a worldwide trend toward networks that
provide different types of telecommunication services, such
as the Integrated Services Digital Network (ISDN). These
networks should provide a wide range of services, from
telephone and cable TV to video conferencing.

Technological progress in data transmission networks has
driven the development of cryptographic algorithms that have
become progressively more complex and robust. They are
widely used by private and governmental organizations, as
well as by individuals, who need to ensure secrecy in data
communication.

The increasing complexity of recent cryptographic algorithms
requires high processing capability because of the large
number of arithmetic and logic operations that have to be
executed, in some cases for real-time applications such as
video conferencing.
PREVIOUS TECHNIQUES 

Direct hardware implementation of cryptographic algorithms
can ensure the high processing speeds required by current and
future applications in data transmission, and can eliminate a
potential bottleneck in data communication networks that
require high security levels.

Consequently, several cryptographic algorithms have been
totally or partially implemented as Application Specific
Integrated Circuits (ASICs).

Several hardware and software implementations have been
developed in the past decade for the Data Encryption Standard
(DES), the most popular private-key cryptographic algorithm.
Table 1 shows the performance obtained for some software
implementations on different platforms. Table 2 shows the
performance obtained for some dedicated hardware
implementations. From Table 2 one can see that the 6868
integrated circuit from VLSI Technology reaches up to
512 Mbit/s, which is not sufficient to support some high-end
ATM applications. Furthermore, cryptanalysis of DES has shown
that it is weaker than some recent private-key cryptographic
algorithms such as IDEA. Few hardware implementations of IDEA
or its predecessors have been reported in the literature. For
example, an ASIC that implements the PES algorithm, from
which IDEA originated, has reached up to 55 Mbit/s at 25 MHz.
DETAILED DESCRIPTION 
IDEA cryptographic algorithm 

The first form of the IDEA algorithm was created by Xuejia
Lai and James Massey in 1990 (US patent 5,214,703) and was
called PES (Proposed Encryption Standard). In 1991, the
algorithm was strengthened and was called IPES (Improved
Proposed Encryption Standard). In 1992 IPES was renamed IDEA
(International Data Encryption Algorithm), and it is
currently considered by many specialists in the field of
cryptography to be the strongest existing symmetric
algorithm.

IDEA is a symmetric, block-oriented cryptographic algorithm
that uses 128-bit keys (thus making it practically immune to
brute-force attacks) and 64-bit plaintext blocks. IDEA is
built upon a basic function, which is iterated multiple
times. As shown in Figure 1, the basic function is iterated
eight times. The first iteration operates on the input 64-bit
plaintext block and each successive iteration operates on the
64-bit block obtained from the previous iteration. After the
last iteration, a final transformation step produces the
64-bit ciphertext block.

Figure 1 also shows the structure of the basic function. It
involves three simple operations: bitwise exclusive-or,
addition modulo 2^16 (addition, ignoring the overflow) and
multiplication modulo 2^16 + 1 (in which the all-zero
sub-block is treated as 2^16). For each iteration, the 64-bit
input block is divided into four 16-bit sub-blocks. In
Figure 1, X1, X2, X3 and X4 denote the four 16-bit input
sub-blocks used by each iteration. The 64-bit block produced
by each iteration also consists of four 16-bit sub-blocks. In
Figure 1, Y1(i), Y2(i), Y3(i) and Y4(i) denote the four
sub-blocks resulting from each iteration. The 128-bit key is
expanded into 52 16-bit sub-keys (sub-key generation is
discussed below). Six sub-keys are used in each iteration and
four sub-keys are used in the final transformation. In
Figure 1, Z1(i), Z2(i), Z3(i), Z4(i), Z5(i) and Z6(i) denote
the six sub-keys used in each iteration. The operations
performed in each iteration are:
1. Multiply sub-block X1(i) by sub-key Z1(i)

2. Add sub-block X2(i) and sub-key Z2(i)

3. Add sub-block X3(i) and sub-key Z3(i)

4. Multiply sub-block X4(i) by sub-key Z4(i)

5. XOR the results of (1) and (3)

6. XOR the results of (2) and (4)

7. Multiply the result of (5) by sub-key Z5(i)

8. Add the results of (6) and (7)

9. Multiply the result of (8) by sub-key Z6(i)

10. Add the results of (7) and (9)

11. XOR the results of (1) and (9)

12. XOR the results of (3) and (9)

13. XOR the results of (2) and (10)

14. XOR the results of (4) and (10)
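
The fourteen steps can be sketched in Python as follows. This is an illustrative software model, not the hardware described in this document; the names add, mul, idea_round and the last flag are assumptions introduced here. The sketch assumes the usual IDEA convention that an all-zero sub-block stands for 2^16 in the multiplication. It also assumes that the results of steps (11) to (14), taken in that order, are what the next iteration receives as X1 to X4, and that the last iteration hands the two inner results over in the order (13), (12).

```python
MASK = 0xFFFF        # sub-blocks and sub-keys are 16 bits wide
MOD_MUL = 0x10001    # 2^16 + 1


def add(a, b):
    """Addition modulo 2^16."""
    return (a + b) & MASK


def mul(a, b):
    """Multiplication modulo 2^16 + 1, where the value 0 stands for 2^16."""
    a = a or 0x10000
    b = b or 0x10000
    return (a * b % MOD_MUL) & MASK


def idea_round(x, z, last=False):
    """Apply steps 1-14 to sub-blocks x = (X1..X4) with sub-keys z = (Z1..Z6)."""
    x1, x2, x3, x4 = x
    z1, z2, z3, z4, z5, z6 = z
    s1 = mul(x1, z1)     # step 1
    s2 = add(x2, z2)     # step 2
    s3 = add(x3, z3)     # step 3
    s4 = mul(x4, z4)     # step 4
    s5 = s1 ^ s3         # step 5
    s6 = s2 ^ s4         # step 6
    s7 = mul(s5, z5)     # step 7
    s8 = add(s6, s7)     # step 8
    s9 = mul(s8, z6)     # step 9
    s10 = add(s7, s9)    # step 10
    r11 = s1 ^ s9        # step 11
    r12 = s3 ^ s9        # step 12
    r13 = s2 ^ s10       # step 13
    r14 = s4 ^ s10       # step 14
    if last:
        # Assumed handling of the last iteration: inner results not exchanged.
        return (r11, r13, r12, r14)
    return (r11, r12, r13, r14)
```

Steps 1 to 4, steps 5 and 6, and steps 11 to 14 read only values computed earlier, which is why each of those groups can be evaluated at the same time.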

The outputs of the iteration are the four sub-blocks produced
by steps (11) to (14). The two inner sub-blocks from steps
(12) and (13), Y2(i) and Y3(i), are swapped, except in the
last iteration.

Figure 1 also shows the structure of the final
transformation. In this figure, Z1(9) to Z4(9) denote the
four 16-bit sub-keys and Y1 to Y4 denote the four 16-bit
sub-blocks of the 64-bit ciphertext block. The operations
performed in the final transformation are:

15. Multiply sub-block X1 by sub-key Z1(9) to obtain Y1

16. Add sub-block X2 and sub-key Z2(9) to obtain Y2

17. Add sub-block X3 and sub-key Z3(9) to obtain Y3

18. Multiply sub-block X4 by sub-key Z4(9) to obtain Y4
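
A minimal Python sketch of steps 15 to 18 is given below. The function names are assumptions made for illustration, and the multiplication again assumes the IDEA convention that the value 0 stands for 2^16.

```python
MASK = 0xFFFF        # 16-bit sub-blocks
MOD_MUL = 0x10001    # 2^16 + 1


def add(a, b):
    """Addition modulo 2^16."""
    return (a + b) & MASK


def mul(a, b):
    """Multiplication modulo 2^16 + 1, where the value 0 stands for 2^16."""
    a = a or 0x10000
    b = b or 0x10000
    return (a * b % MOD_MUL) & MASK


def final_transformation(x, z9):
    """Steps 15-18: combine sub-blocks X1..X4 with sub-keys Z1(9)..Z4(9)."""
    x1, x2, x3, x4 = x
    z1, z2, z3, z4 = z9
    return (
        mul(x1, z1),  # step 15 -> Y1
        add(x2, z2),  # step 16 -> Y2
        add(x3, z3),  # step 17 -> Y3
        mul(x4, z4),  # step 18 -> Y4
    )
```

The four operations do not depend on one another, so they can all be evaluated in the same cycle.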

The encryption and decryption sub-keys are generated from the
single 128-bit key. Encryption sub-keys are generated as
follows. Initially, the 128-bit key is divided into eight
16-bit sub-keys. Six of these sub-keys, Z1(1) to Z6(1), are
used in the first iteration. The two remaining sub-keys,
Z1(2) and Z2(2), are for the second iteration. The original
128-bit key is then rotated left by 25 bits and the resulting
key is again divided into eight 16-bit sub-keys. Four
sub-keys, Z3(2) to Z6(2), are grouped with Z1(2) and Z2(2)
and assigned to the second iteration. The other four
sub-keys, Z1(3) to Z4(3), are to be used in the third
iteration. Next, the key is again rotated left by 25 bits and
divided into eight 16-bit sub-keys, and these sub-keys are
grouped in the same way. This process is repeated until all
of the sub-keys for the eight iterations and for the final
transformation have been generated. Decryption sub-keys are
calculated as either the additive or the multiplicative
inverses of the encryption sub-keys.
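
The encryption sub-key generation can be sketched in Python as below. The function names are assumptions, and the sketch assumes that the key is held as a 128-bit integer whose most significant 16 bits form the first sub-key. The two inverse helpers only show the arithmetic that the decryption sub-keys rely on. The order in which the inverted sub-keys are assigned to the decryption iterations is not given in the text and is not modelled here.

```python
KEY_MASK = (1 << 128) - 1


def encryption_subkeys(key):
    """Expand a 128-bit integer key into the 52 16-bit encryption sub-keys."""
    if not 0 <= key <= KEY_MASK:
        raise ValueError("key must fit in 128 bits")
    subkeys = []
    while len(subkeys) < 52:
        # Divide the current key into eight 16-bit sub-keys, leftmost first.
        for shift in range(112, -16, -16):
            subkeys.append((key >> shift) & 0xFFFF)
        # Rotate the 128-bit key left by 25 bits.
        key = ((key << 25) | (key >> 103)) & KEY_MASK
    return subkeys[:52]


def group_subkeys(subkeys):
    """Group 52 sub-keys: six per iteration, four for the final transformation."""
    groups = [subkeys[6 * r:6 * r + 6] for r in range(8)]
    groups.append(subkeys[48:52])
    return groups


def additive_inverse(z):
    """Inverse for addition modulo 2^16."""
    return (-z) & 0xFFFF


def multiplicative_inverse(z):
    """Inverse for multiplication modulo 2^16 + 1 (0 stands for 2^16)."""
    z = z or 0x10000
    # 2^16 + 1 is prime, so z^(p-2) is the inverse of z (Fermat's little theorem).
    return pow(z, 0x10001 - 2, 0x10001) & 0xFFFF
```

Seven passes of the loop produce 56 sub-keys, of which only the first 52 are kept.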

As stated, the main goal in designing HiPCrypto is to obtain
a device that meets the performance requirements of
applications in current and future high-speed data networks.
This was achieved by including parallel execution techniques
in the design of HiPCrypto's architecture. There are two
opportunities for exploiting parallelism in the IDEA
algorithm: in the execution of its basic function and in the
iterations of this function.

Examining the data flow shown in Figure 1, one can identify
groups of operations that are data independent. In each
group, no operation uses the results produced by other
operations in the group. The sets of independent operations
are: the multiply and add operations in steps (1) to (4); the
exclusive-or operations in steps (5) and (6); and the
exclusive-or operations in steps (11) to (14). These
independent operations can be performed simultaneously,
provided the architecture incorporates multiple functional
units dedicated to the execution of each of them.
By including multiple functional units in the architecture,
we are making use of spatial parallelism. Temporal
parallelism can also be employed in the execution of the
basic function, by overlapping in time the operations upon
distinct plaintext blocks. In this way, multiple blocks can
be encrypted (or decrypted) simultaneously, instead of
sequentially. This temporal parallelism was implemented with
the pipeline shown in Figure 2.

Stage 1 contains two add and two multiply units that perform
in parallel the independent operations in steps (1) to (4) of
the algorithm.

Stage 2 contains two exclusive-or units to execute the
operations in steps (5) and (6) in parallel.

Stages 3, 4, 5 and 6 each contain a single add or multiply
unit and execute, respectively, the operations in steps (7),
(8), (9) and (10) of the algorithm.

Stage 7 has four exclusive-or units to execute steps (11) to
(14) in parallel.

The last stage has two add units and two multiply units and
performs the algorithm's final transformation (see Figure 2).
This stage will be referred to as the output stage.

In Figure 2, one can notice the inclusion of queues between
stages of the pipeline. These queues temporarily hold data
forwarded between non-adjacent stages. For instance, stage 7
operates on sub-blocks from stages 1 and 5 (see Figure 1).

A sub-block from stage 1 arrives five cycles before the
corresponding sub-block from stage 5, and during this time
interval it remains in one of the queues connecting stages 1
and 7. When the sub-block from stage 5 is available, the
sub-block at the front of the queue is dequeued and paired
with the sub-block from stage 5. A queue is needed along the
shortest path (in number of stages) between two non-neighbor
stages. The size of each queue is indicated in Figure 2.
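
The role of these queues can be illustrated with a small Python model. It assumes that every stage takes exactly one clock cycle, so that a value forwarded from stage p to a non-adjacent stage c has to wait c - p - 1 cycles. The names fifo_depth and DelayQueue are hypothetical. For the forwarding paths 1 to 7, 3 to 6, 2 to 4 and 5 to 7, this rule gives 5, 2, 1 and 1 positions, which are the FIFO sizes listed in claim 12.

```python
from collections import deque


def fifo_depth(producer, consumer):
    """Cycles a value must wait when forwarded between non-adjacent stages."""
    if consumer <= producer:
        raise ValueError("consumer stage must come after producer stage")
    return consumer - producer - 1


class DelayQueue:
    """FIFO that returns each value `depth` clock cycles after it was pushed."""

    def __init__(self, depth):
        self._slots = deque([None] * depth)

    def clock(self, value):
        """Advance one cycle: enqueue `value`, dequeue the oldest entry."""
        self._slots.append(value)
        return self._slots.popleft()


# Forwarding paths between non-adjacent stages, as (producer, consumer).
PATHS = [(1, 7), (3, 6), (2, 4), (5, 7)]
DEPTHS = {path: fifo_depth(*path) for path in PATHS}
```

Clocking a DelayQueue(5) with the values 0, 1, 2, ... returns None for the first five cycles and then 0, 1, 2, ... in order, so each sub-block reappears exactly when its partner from the later stage is ready.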

The final aspect of HiPCrypto's architecture concerns the
generation and storage of the sub-keys. Generating the
encryption sub-keys would require circuitry for the rotation
and sub-division of the 128-bit key. Moreover, generating the
decryption sub-keys would require an arithmetic unit for the
calculation of additive and multiplicative inverses. The
inclusion of this additional hardware would only be
reasonable if the key changed very frequently, say, every few
blocks. But that is not the common case in a private-key
cryptosystem: typically, the key shared by a group of
partners is changed on a long-term basis (every few days or
weeks, for example). For this reason, only sub-key storage is
provided. Sub-keys are generated externally by the host
system and then downloaded into the chip.
Architecture of HiPCrypto

The HiPCrypto architecture, shown in Figure 3, executes a
complete iteration of the algorithm. This architecture is
composed of six 16-bit multipliers, six 16-bit adders and six
16-bit exclusive-or units, memories for sub-key storage,
buffers, tri-state drivers and a control unit.
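
As a cross-check, the unit counts can be tallied from the stage descriptions given earlier. The short Python sketch below only restates those descriptions in a table; the dictionary layout and the names STAGE_UNITS and total_units are assumptions made for illustration.

```python
from collections import Counter

# Functional units per pipeline stage, as described for stages 1 to 7
# and for the output stage.
STAGE_UNITS = {
    "stage 1": {"mul": 2, "add": 2},   # steps 1-4
    "stage 2": {"xor": 2},             # steps 5-6
    "stage 3": {"mul": 1},             # step 7
    "stage 4": {"add": 1},             # step 8
    "stage 5": {"mul": 1},             # step 9
    "stage 6": {"add": 1},             # step 10
    "stage 7": {"xor": 4},             # steps 11-14
    "output": {"mul": 2, "add": 2},    # steps 15-18
}


def total_units(stage_units):
    """Sum the number of units of each kind over all stages."""
    totals = Counter()
    for units in stage_units.values():
        totals.update(units)
    return totals


TOTALS = total_units(STAGE_UNITS)
```

The totals come out as six multipliers, six adders and six exclusive-or units.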

The operations contained in each stage of the pipeline are
executed in a single machine cycle and, since there are
7 pipeline stages, the circuit ciphers (or deciphers) seven
64-bit blocks in each execution of the algorithm.

HiPCrypto was designed to offer four configurations, i.e.,
1, 2, 4 or 8 integrated circuits in series (Table 3).

Each pipeline stage is executed in one clock cycle. In the
one-chip configuration, seven 64-bit blocks are processed
every 56 (7 x 8) machine cycles. With 2 chips, seven 64-bit
blocks are processed every 28 (7 x 4) machine cycles. With
4 chips, seven 64-bit blocks are processed every 14 (7 x 2)
machine cycles. With 8 chips, seven 64-bit blocks are
processed every 7 (7 x 1) machine cycles, that is to say, one
64-bit block per machine cycle.
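
The cycle counts above follow from a simple rule, sketched here in Python. The function names are assumptions, the model describes steady-state operation only (pipeline fill time is ignored), and the clock frequency is left as a free parameter rather than taken from any table in this document.

```python
def cycles_per_batch(chips, stages=7, iterations=8):
    """Machine cycles needed to process one batch of `stages` blocks."""
    if chips not in (1, 2, 4, 8):
        raise ValueError("supported configurations are 1, 2, 4 or 8 chips")
    # Each chip executes iterations / chips passes through its 7-stage pipeline.
    return stages * (iterations // chips)


def bits_per_cycle(chips, block_bits=64, stages=7, iterations=8):
    """Average number of ciphered bits delivered per machine cycle."""
    return stages * block_bits / cycles_per_batch(chips, stages, iterations)


def throughput_mbps(chips, clock_mhz):
    """Steady-state throughput in Mbit/s for a given clock in MHz."""
    return bits_per_cycle(chips) * clock_mhz
```

Doubling the number of chips halves the cycles per batch and therefore doubles the bits delivered per cycle, from 8 bits per cycle with one chip to 64 bits per cycle with eight.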

The proposed HiPCrypto structure can be adapted to different
uses. An adequate compromise between throughput and cost can
be obtained by selecting the number of chips operating in
series.

The signals used for selecting the chip configuration are
divided into two groups: three signals that define the
configuration, cch<2:0>, and three signals that define the
position of the chip in the chain, pos<2:0>. Tables 4 and 5
show, respectively, the configurations and the possible
positions.

The sub-keys are stored in 4 RAMs, as shown in Figures 3
and 4.
For sub-keys Z1(i), Z2(i), Z3(i) and Z4(i), a 128-bit x 8
memory is used. The 64 least significant bits of each memory
position (bits 0 to 63) store the cipher sub-keys and the
64 most significant bits (bits 64 to 127) store the decipher
sub-keys. The choice between cipher and decipher mode is made
through the bus selection (see Figures 3 and 4).

For sub-keys Z5(i) and Z6(i), two 32-bit x 8 RAMs are used,
where the 16 least significant bits (0 to 15) store the
cipher sub-keys Z5(i) and Z6(i), and the 16 most significant
bits store the decipher sub-keys.

For the sub-keys Z1(9), Z2(9), Z3(9) and Z4(9), a 64-bit x 2
memory is used.
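
The word layout of the 128-bit x 8 memory can be modelled as follows. This Python sketch is illustrative only: the function names are hypothetical, and the placement of Z1 in the least significant 16 bits of each 64-bit half is an assumption, since the text fixes only which half holds the cipher sub-keys and which holds the decipher sub-keys.

```python
def pack_word(cipher_keys, decipher_keys):
    """Build one 128-bit memory word from four cipher and four decipher sub-keys.

    Bits 0-63 hold the cipher sub-keys and bits 64-127 the decipher sub-keys.
    Within each half, Z1 is assumed to occupy the lowest 16 bits.
    """
    if len(cipher_keys) != 4 or len(decipher_keys) != 4:
        raise ValueError("expected four sub-keys per half")
    word = 0
    for index, key in enumerate(list(cipher_keys) + list(decipher_keys)):
        word |= (key & 0xFFFF) << (16 * index)
    return word


def read_subkeys(word, decipher):
    """Select the cipher (low) or decipher (high) half and split it into Z1..Z4."""
    half = (word >> 64) if decipher else word
    return tuple((half >> (16 * index)) & 0xFFFF for index in range(4))
```

Selecting between the two halves with a single mode bit plays the role of the bus selection: the same memory address serves both ciphering and deciphering.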
Control Unit

The control unit (see Figure 3) is the operational block that
controls the operation of the architecture. This unit,
together with some extra circuits, is responsible for the
generation of the control signals. The main functions of this
unit are described below.

The control unit selects the ciphering and deciphering modes,
i.e., it selects the cipher or decipher sub-keys,
respectively, in each embedded memory.

The control unit also allows the correct initialization,
feeding and synchronization of the pipeline stages by
generating all enable and reset signals for each internal
block.

The output stage is used only by the last chip in each
configuration. This selection is also performed by the
control unit, according to the configuration selected for
each chip.
HiPCrypto performance

Table 7 shows some examples of the performance of HiPCrypto
implemented in a two-metal-layer 0.7 micron CMOS technology.
CLAIMS

1. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, patented in the USA
under No. US 5,214,703, that makes use of a seven-stage
pipeline implemented as a synchronous circuit, referred to as
the micro-pipeline, coupled to an output stage as described
in Figure 2; so that each stage of the pipeline supplies
partial results to the following stage and receives partial
results from the previous stage at each clock pulse of the
synchronous circuit; so that there exists a feedback from
stage number 7 to stage number 1, controlled by the digital
control unit so that, for each of 16 rounds for ciphering a
64-bit block, the first stage of the pipeline is fed with
partial results from the output of the seventh pipeline
stage, and the pipeline is fed with a new block when the
ciphering process is completed; and the sub-keys used in the
data ciphering and deciphering processes, in agreement with
the definition of the IDEA algorithm, are stored in four
dedicated memory units.

2. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operations 1, 2, 3 and 4 of the description of the IDEA
cryptographic algorithm are executed by two 16-bit multiplier
units and two 16-bit adder units; and these units compose
pipeline stage number one as described in Figure 2; so that
this stage receives either a new 64-bit block from the data
input or a partial result from the seventh stage, and the
ciphering or deciphering sub-keys corresponding to this stage
as described in Figures 2 and 3; so that the inputs and
outputs of this stage are connected to input and output
registers respectively.

3. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operations 5 and 6 of the description of the IDEA
cryptographic algorithm are executed by two 16-bit
exclusive-or units; and these units compose stage number two
of the pipeline as described in Figure 2; so that this stage
receives partial results from the first stage as described in
Figures 2 and 3; so that the inputs and outputs of this stage
are connected to input and output registers respectively.

4. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operation 7 of the description of the IDEA
cryptographic algorithm is executed by a 16-bit multiplier
unit; and this unit composes stage number three of the
pipeline as described in Figure 2; so that this stage
receives partial results from the previous stage and the
ciphering and deciphering sub-keys corresponding to this
stage as described in Figures 2 and 3; and that the inputs
and outputs of this stage are connected to input and output
registers respectively.

5. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operation 8 of the description of the IDEA
cryptographic algorithm is executed by a 16-bit adder unit;
and this unit composes stage number four of the pipeline as
described in Figure 2; so that this stage receives partial
results from stages 2 and 3 as described in Figure 2; and
that the inputs and outputs of this stage are connected to
input and output registers respectively.

6. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operation 9 of the description of the IDEA
cryptographic algorithm is executed by a 16-bit multiplier
unit; and this unit composes stage number five of the
pipeline as described in Figure 2; so that this stage
receives partial results from the previous stage and the
ciphering and deciphering sub-keys corresponding to this
stage as described in Figures 2 and 3; and that the inputs
and outputs of this stage are connected to input and output
registers respectively.

7. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operation 10 of the description of the IDEA
cryptographic algorithm is executed by a 16-bit adder unit;
and this unit composes stage number six of the pipeline as
described in Figure 2; so that this stage receives partial
results from stages 3 and 5 as described in Figure 2; and
that the inputs and outputs of this stage are connected to
input and output registers respectively.

8. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operations 11, 12, 13 and 14 of the description of the
IDEA cryptographic algorithm are executed by four 16-bit
exclusive-or units; and these units compose stage number
seven of the pipeline as described in Figure 2; so that this
stage receives partial results from stages 1, 5 and 6 as
described in Figure 2; and that the inputs and outputs of
this stage are connected to input and output registers
respectively.

9. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which operations 15, 16, 17 and 18 of the description of the
IDEA cryptographic algorithm are executed by two 16-bit
multiplier units and two 16-bit adder units; and these units
compose the output stage of the pipeline as described in
Figure 2; so that this stage receives partial results from
stage 7 of the pipeline and the ciphering and deciphering
sub-keys corresponding to the output stage as described in
Figures 2 and 3; and that the inputs and outputs of this
stage are connected to input and output registers
respectively.

10. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claim 1, in
which the sub-keys used in the ciphering and deciphering
processes are stored in four dedicated memories according to
Figure 4 as follows: ciphering sub-keys Z1(i), Z2(i), Z3(i)
and Z4(i) (i from 1 to 8) and deciphering sub-keys Z1(i),
Z2(i), Z3(i) and Z4(i) (i from 1 to 8) stored in a 128-bit
x 8 memory; ciphering sub-keys Z5(i) (i from 1 to 8) and
deciphering sub-keys Z5(i) (i from 1 to 8) stored in the
first 32-bit x 8 memory; ciphering sub-keys Z6(i) (i from 1
to 8) and deciphering sub-keys Z6(i) (i from 1 to 8) stored
in the second 32-bit x 8 memory; ciphering sub-keys Z1(9),
Z2(9), Z3(9) and Z4(9) and deciphering sub-keys Z1(9), Z2(9),
Z3(9) and Z4(9) stored in the 64-bit x 2 memory.
11. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claims 1
and 2, in which a second pipeline level, called the
macro-pipeline, allows the concatenation of 2, 4 or 8
circuits, each operating with a micro-pipeline of seven
stages, as indicated in Table 3.

12. A METHOD FOR THE HARDWARE IMPLEMENTATION OF THE IDEA
CRYPTOGRAPHIC ALGORITHM - HiPCrypto, according to claims 1
and 2, in which "first-in first-out" (FIFO) memories are used
to synchronize the data coming from non-adjacent stages in
the following way: a 64-bit x 5-position FIFO connecting
stages 1 and 7, a 16-bit x 2-position FIFO connecting stages
3 and 6, a 16-bit x 1-position FIFO connecting stages 2 and
4, and a 16-bit x 1-position FIFO connecting stages 5 and 7,
as described in Figure 3.
DRAWINGS



Table 1

Speed of the DES algorithm implemented on different
platforms.

Platform              Frequency (MHz)   Throughput (cipher)
8088                  4.7               23.68 Kbps
68000                 7.6               57.6 Kbps
80286                 6                 70.4 Kbps
68020                 16                224 Kbps
68030                 16                249.6 Kbps
80386                 25                320 Kbps
68030                 50                640 Kbps
68040                 25                1.024 Mbps
68040                 40                1.472 Mbps
80486                 66                2.752 Mbps
HP900/887             125               10.816 Mbps
SunELC                                  1.664 Mbps
HyperSparc                              2.048 Mbps
RS6000-350                              3.392 Mbps
Sparc 10/52                             5.376 Mbps
DEC Alpha 4000/610                      9.856 Mbps
Table 2

Commercial integrated circuits implementing the DES algorithm

Manufacturer               Chip                   Publication year   Frequency     Throughput
AMD                        Am9518                 1981               [illegible]   10.4 Mbps
AMD                        Am9568                 [illegible]        4 MHz         12 Mbps
AMD                        AmZ8068                [illegible]        4 MHz         116 Mbps
[illegible]                [illegible]            [illegible]        ?             15.2 Mbps
CE-Infosys                 SuperCrypt             [illegible]        [illegible]   100 Mbps
CE-Infosys                 SuperCrypt CE99C003A   1994               30 MHz        160 Mbps
Cryptech                   Cry12C102              1989               20 MHz        22.4 Mbps
Newbridge                  CA20C03A               1991               25 MHz        30.8 Mbps
Newbridge                  CA20C03W               1992               8 MHz         5.12 Mbps
Newbridge                  CA95C68/18/09          1993               33 MHz        117.4 Mbps
Pijnenburg                 PCC100                 ?                  ?             20 Mbps
Semaphore Communications   Roadrunner284          ?                  40 MHz        284 Mbps
VLSI Technology            VM007                  1993               32 MHz        160 Mbps
VLSI Technology            VM009                  1993               33 MHz        112 Mbps
VLSI Technology            6868                   1995               32 MHz        512 Mbps

Table 3

Configuration           Number of iterations per configuration
1 integrated circuit    8 iterations
2 integrated circuits   4 iterations in each integrated circuit
4 integrated circuits   2 iterations in each integrated circuit
8 integrated circuits   1 iteration in each integrated circuit
Table 4:

cch2  cch1  cch0   Configuration
0     0     0      1 integrated circuit.
0     0     1      2 series-connected integrated circuits.
0     1     0      4 series-connected integrated circuits.
1     0     0      8 series-connected integrated circuits.

Table 5:

pos2  pos1  pos0   Integrated circuit position.
0     0     0      First position
0     0     1      Second position
0     1     0      Third position
0     1     1      Fourth position
1     0     0      Fifth position
1     0     1      Sixth position
1     1     0      Seventh position
1     1     1      Eighth position


Table 6:

Configuration        Position
cch2  cch1  cch0     pos2  pos1  pos0
0     0     0        0     0     0
0     0     1        0     0     0
                     0     0     1
0     1     0        0     0     0
                     0     0     1
                     0     1     0
                     0     1     1
1     0     0        0     0     0
                     0     0     1
                     0     1     0
                     0     1     1
                     1     0     0
                     1     0     1
                     1     1     0
                     1     1     1
Table 7:

Macro-pipeline configuration   Performance (clock frequency: 59 MHz)
1 integrated circuit           424 Mbps
2 integrated circuits          848 Mbps
4 integrated circuits          1.7 Gbps
8 integrated circuits          3.4 Gbps
Figure 1 [drawing: first iteration of the basic function, with input sub-blocks X1 to X4 and output sub-blocks Y1 to Y4]
Figure 2 [drawing: pipeline with STAGE 1 to STAGE 7 followed by the OUTPUT STAGE, which delivers Y1 to Y4 as the output after 8 iterations]
Figure 4 [drawing: basic structure (data input, Stage 1 to Stage 7, output stage, data output) with the sub-key memories: a 128 x 8 memory holding Z1(i), Z2(i), Z3(i), Z4(i) and Z1'(i), Z2'(i), Z3'(i), Z4'(i), i = 1..8; two 32 x 8 memories holding Z5(i), Z5'(i) and Z6(i), Z6'(i), i = 1..8; a 64 x 2 memory holding Z1(9), Z2(9), Z3(9), Z4(9) and Z1'(9), Z2'(9), Z3'(9), Z4'(9)]